Search results for "Matrix multiplication"

showing 10 items of 20 documents

FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures

2020

Deep Learning is ubiquitous in a wide field of applications ranging from research to industry. In comparison to time-consuming iterative training of convolutional neural networks (CNNs), inference is a relatively lightweight operation making it amenable to execution on mobile devices. Nevertheless, lower latency and higher computation efficiency are crucial to allow for complex models and prolonged battery life. Addressing the aforementioned challenges, we propose FeatherCNN – a fast inference library for ARM CPUs – targeting the performance ceiling of mobile devices. FeatherCNN employs three key techniques: 1) A highly efficient TensorGEMM (generalized matrix multiplication) routine is app…

020203 distributed computingSource codeIterative methodComputer sciencebusiness.industrymedia_common.quotation_subjectDeep learningInference02 engineering and technologyParallel computingConvolutional neural networkMatrix multiplicationARM architectureComputational Theory and MathematicsHardware and ArchitectureSignal Processing0202 electrical engineering electronic engineering information engineeringArtificial intelligencebusinessmedia_commonIEEE Transactions on Parallel and Distributed Systems
researchProduct

Reverse-safe data structures for text indexing

2021

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm which constructs a z-reverse-safe data structure that has size O(n) and answers pattern matching queries of length at most d optim…

050101 languages & linguisticsComputer sciencedata structure02 engineering and technologyprivacySet (abstract data type)combinatoric0202 electrical engineering electronic engineering information engineering0501 psychology and cognitive sciencesPattern matchingSettore ING-INF/05 - Sistemi Di Elaborazione Delle InformazionialgorithmSettore INF/01 - Informatica05 social sciencesSearch engine indexingINF/01 - INFORMATICAdata miningData structureMatrix multiplicationcombinatoricsExponent020201 artificial intelligence & image processingdata structure; algorithm; combinatorics; de Bruijn graph; data mining; privacyAlgorithmAdversary modelde Bruijn graphInteger (computer science)
researchProduct

Generalized centro-invertible matrices with applications

2014

Centro-invertible matrices are introduced by R.S. Wikramaratna in 2008. For an involutory matrix R, we define the generalized centro-invertible matrices with respect to R to be those matrices A such that RAR = A^−1. We apply these matrices to a problem in modular arithmetic. Specifically, algorithms for image blurring/deblurring are designed by means of generalized centro-invertible matrices. In addition, if R1 and R2 are n × n involutory matrices, then there is a simple bijection between the set of all centro-invertible matrices with respect to R1 and the set with respect to R2.

Centro-symmetric matrixSquare root of a 2 by 2 matrixApplied MathematicsInvolutory matrixINGENIERIA TELEMATICAMatrius (Matemàtica)Matrix ringMatrix multiplicationCombinatoricsMatrix (mathematics)Integer matrix2 × 2 real matricesCentro-invertible matrixMatrix analysisInvolutory matrixMATEMATICA APLICADAComputer Science::Distributed Parallel and Cluster ComputingMathematics
researchProduct

Fast Matrix Multiplication

2015

Until a few years ago, the fastest known matrix multiplication algorithm, due to Coppersmith and Winograd (1990), ran in time O(n2.3755). Recently, a surge of activity by Stothers, Vassilevska-Williams, and Le~Gall has led to an improved algorithm running in time O(n2.3729). These algorithms are obtained by analyzing higher and higher tensor powers of a certain identity of Coppersmith and Winograd. We show that this exact approach cannot result in an algorithm with running time O(n2.3725), and identify a wide class of variants of this approach which cannot result in an algorithm with running time $O(n^{2.3078}); in particular, this approach cannot prove the conjecture that for every e > 0, …

Class (set theory)Conjecturepeople.profession0102 computer and information sciences02 engineering and technology01 natural sciencesIdentity (music)Matrix multiplicationRunning timeCombinatorics010201 computation theory & mathematicsTensor (intrinsic definition)0202 electrical engineering electronic engineering information engineering020201 artificial intelligence & image processingCoppersmithpeopleMathematicsCoppersmith–Winograd algorithmProceedings of the forty-seventh annual ACM symposium on Theory of Computing
researchProduct

An Scalable matrix computing unit architecture for FPGA and SCUMO user design interface

2019

High dimensional matrix algebra is essential in numerous signal processing and machine learning algorithms. This work describes a scalable square matrix-computing unit designed on the basis of circulant matrices. It optimizes data flow for the computation of any sequence of matrix operations removing the need for data movement for intermediate results, together with the individual matrix operations’ performance in direct or transposed form (the transpose matrix operation only requires a data addressing modification). The allowed matrix operations are: matrix-by-matrix addition, subtraction, dot product and multiplication, matrix-by-vector multiplication, and matrix by scalar multiplication.…

Computer Networks and CommunicationsComputer scienceMathematicsofComputing_NUMERICALANALYSISSistemes informàticslcsh:TK7800-836002 engineering and technologyScalar multiplicationComputational scienceMatrix (mathematics)matrix-computing unitTranspose0202 electrical engineering electronic engineering information engineeringmatrix processorElectrical and Electronic EngineeringCirculant matrixcirculant matricesFPGA020208 electrical & electronic engineeringlcsh:ElectronicsDot productMatrix multiplicationArquitectura d'ordinadorsHardware and ArchitectureControl and Systems Engineeringmatrix arithmeticSignal Processing020201 artificial intelligence & image processingMultiplicationhardware implementation
researchProduct

Versatile Direct and Transpose Matrix Multiplication with Chained Operations: An Optimized Architecture Using Circulant Matrices

2016

With growing demands in real-time control, classification or prediction, algorithms become more complex while low power and small size devices are required. Matrix multiplication (direct or transpose) is common for such computation algorithms. In numerous algorithms, it is also required to perform matrix multiplication repeatedly, where the result of a multiplication is further multiplied again. This work describes a versatile computation procedure and architecture: one of the matrices is stored in internal memory in its circulant form, then, a sequence of direct or transpose multiplications can be performed without timing penalty. The architecture proposes a RAM-ALU block for each matrix c…

Cycles per instructionBlock matrix020206 networking & telecommunications02 engineering and technologyParallel computingMatrix chain multiplicationMatrix multiplication020202 computer hardware & architectureTheoretical Computer ScienceMatrix (mathematics)Computational Theory and MathematicsHardware and ArchitectureTranspose0202 electrical engineering electronic engineering information engineeringMultiplicationHardware_ARITHMETICANDLOGICSTRUCTURESArithmeticCirculant matrixSoftwareMathematicsIEEE Transactions on Computers
researchProduct

Reverse-Safe Text Indexing

2021

We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z - reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D . The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z , we propose an algorithm that constructs a z -reverse-safe data structure ( z -RSDS) that has size O(n) and answers decision and counting pattern matc…

Data structuresComputer scienceSuffix treesuffix tree0102 computer and information sciences02 engineering and technologytext indexing01 natural sciencesTheoretical Computer Sciencelaw.inventionSet (abstract data type)law020204 information systems0202 electrical engineering electronic engineering information engineeringPattern matchingdata privacySettore INF/01 - InformaticaSearch engine indexingdata privacy; Data structures; pattern matching; suffix tree; text indexingData structureMatrix multiplicationpattern matching010201 computation theory & mathematicsData structureAlgorithmAdversary modelInteger (computer science)ACM Journal of Experimental Algorithmics
researchProduct

A Practical Introduction to Tensor Networks: Matrix Product States and Projected Entangled Pair States

2013

This is a partly non-technical introduction to selected topics on tensor network methods, based on several lectures and introductory seminars given on the subject. It should be a good place for newcomers to get familiarized with some of the key ideas in the field, specially regarding the numerics. After a very general introduction we motivate the concept of tensor network and provide several examples. We then move on to explain some basics about Matrix Product States (MPS) and Projected Entangled Pair States (PEPS). Selected details on some of the associated numerical methods for 1d and 2d quantum lattice systems are also discussed.

High Energy Physics - TheoryQuantum PhysicsStrongly Correlated Electrons (cond-mat.str-el)Computer scienceNumerical analysisHigh Energy Physics - Lattice (hep-lat)FOS: Physical sciencesGeneral Physics and AstronomyMatrix multiplicationAlgebraCondensed Matter - Strongly Correlated ElectronsHigh Energy Physics - LatticeHigh Energy Physics - Theory (hep-th)Lattice (order)Quantum Physics (quant-ph)Quantum
researchProduct

Fast Approximated Discriminative Common Vectors Using Rank-One SVD Updates

2013

An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.

Kernel (linear algebra)Discriminative modelRank (linear algebra)Computer scienceDimensionality reductionSingular value decompositionSpace (mathematics)AlgorithmMatrix multiplicationImage (mathematics)
researchProduct

A recurrence-free variant of strassen’s algorithm on hypercube

1995

In this paper a non-recursive Strassen’s matrix multiplication algorithm is presented. This new algorithm is suitable to run on parallel environments. Two computational schemes have been worked out exploiting different parallel approaches on hypercube architecture. A comparative analysis is reported. The experiments have been carried out on an nCUBE-2 supercomputer, housed at CNUCE in Pisa, supporting the Express parallel operating system. © 1995, Taylor & Francis Group, LLC. All rights reserved.

Matrix multiplicationGeneral Computer ScienceComputer scienceExpress operating systemComputer Science (all)Parallel computingStrassen’s algorithmSupercomputerMatrix multiplicationStrassen algorithmHypercube architectureHypercubeAlgorithmHypercube architecture
researchProduct